1. Introduction

Since the microblogging service Twitter was launched in 2006, it has
become very popular among a wide community of users. It is estimated that,
as of 2012, there are over 500 million users who generate approximately 350
million posts per day. Public posts may be queried through an API provided
by Twitter. About 1% of all posts are tagged with GPS coordinates. This
makes them an interesting source for geographical analysis with volunteered
geographic information (VGI). Figure 1 and Figure 2 visualise this data in
the area of Dresden.

Figure 1. Distribution of recorded georeferenced posts (German language only) in
the area of Dresden.

Figure 2. Bottom: A word cloud that visualises the contents of the posts within the
given map extent (cf. also Hahmann & Burghardt 2011). The words are scaled
according to their frequency of occurrence.



2. VGI: Photos vs. Microblogging 

Within recent years, a major focus in the field of geographical analysis of
VGI has been on Flickr photo data. However, a comparison (Figure 3) of the
temporal distributions of georeferenced Flickr photos and Twitter
microblogging posts ('Tweets') shows that there are significant differences
in the temporal usage patterns of the two services. Both activities, taking
photos and microblogging, have peaks at different times. While taking
photos is an activity that strongly correlates with daylight periods, this
trend cannot be observed as clearly for Twitter usage. The dates on which
photos were taken peak in the afternoon, at weekends (when more people have
spare time during daylight periods) and during the summer months. In
contrast, the hourly distribution of Twitter usage has a peak during the
late night hours. Moreover, Twitter usage is almost equally distributed
from Monday to Sunday.
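
A comparison of this kind can be sketched by binning timestamps by hour of
day, day of week and month of year. The following is a minimal Python
sketch, not the procedure used for Figure 3: the function names are
hypothetical, the timestamps are made-up examples, and it assumes that all
timestamps have already been converted to local time.

```python
from collections import Counter
from datetime import datetime


def temporal_histograms(timestamps):
    """Count items per hour of day, day of week (Monday=1) and month (January=1)."""
    per_hour = Counter(ts.hour for ts in timestamps)
    per_weekday = Counter(ts.isoweekday() for ts in timestamps)
    per_month = Counter(ts.month for ts in timestamps)
    return per_hour, per_weekday, per_month


def relative(counter):
    """Turn absolute counts into shares so that services of different size compare."""
    total = sum(counter.values())
    return {key: count / total for key, count in sorted(counter.items())}


# Made-up local timestamps for illustration only (not data from the study).
sample = [
    datetime(2012, 7, 14, 15, 30),  # a Saturday afternoon in July
    datetime(2012, 7, 15, 16, 5),   # a Sunday afternoon in July
    datetime(2012, 1, 9, 23, 45),   # a Monday late at night in January
]

per_hour, per_weekday, per_month = temporal_histograms(sample)
month_shares = relative(per_month)
```

Working with shares rather than absolute counts keeps the two services
comparable even if one of them contributes far more items than the other.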

This shows that both activities are, at least partially, carried out within
different contexts of the contributing users' lives. This may have
implications if microblogging contents are used as an alternative VGI
source for geographical analyses. One application scenario within this
field of research is the description and modeling of vague places (cf. e.g.
Jones et al. 2008, Hollenstein and Purves 2010) or, more generally, the
description of the geospatial context of places.



[Figure 3 plots. Panels: Photos per Month of Year (month of year,
January=1, December=12; series: Flickr Photos); Photos/Tweets per Day of
Week (day of week, Monday=1, Sunday=7; series: Flickr Photos, Twitter
Tweets); Photos/Tweets per Hour (hour of day; series: Flickr Photos,
Twitter Tweets).]

Figure 3. Temporal distribution of georeferenced Flickr photos and Twitter
microblogging posts. All data have been collected within the area of Germany.



3. Correlation between Geospatial Context and Mobile Microblogging Contents

Preliminary analyses show that a certain share of mobile microblogging
posts is related to the place where they were generated. For example,
posts related to public transport may be found near train stations and
posts related to movies may be found near cinemas.
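
The notion of a post being 'near' a point of interest can be made concrete
as the great-circle distance to the closest one. The following is a minimal
Python sketch under stated assumptions: the haversine formula on a
spherical Earth is our choice for illustration, the function names are
hypothetical, and the coordinates are made-up values roughly in the Dresden
area, not surveyed station positions.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def distance_to_nearest(post, pois):
    """Distance in metres from one (lat, lon) post to the closest (lat, lon) POI."""
    return min(haversine_m(post[0], post[1], lat, lon) for lat, lon in pois)


# Made-up coordinates for illustration only.
stations = [(51.04, 13.73), (51.07, 13.74)]
post = (51.05, 13.73)

nearest_m = distance_to_nearest(post, stations)
log_distance = math.log10(nearest_m)  # distances are often compared on a log scale
```

A linear scan over all POIs is sufficient for small data sets; for larger
ones a spatial index would be the more usual choice.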

Kwak et al. (2010) show that the microblogging service Twitter has
properties of both a social network and a news medium. Java et al. (2007)
suggested a taxonomy of the intentions behind Twitter users' microblogging
posts: daily chatter, conversation, information sharing, news reporting.
This indicates that communication plays a crucial role in Twitter usage. It
implies that Tweets are not necessarily influenced by, or related to, the
location where Twitter users post their messages. This needs to be
considered by any geographical analysis approach using Twitter data.

Figure 4. Density plots for the distribution of Twitter microblogging posts
with regard to their distance to the nearest train station, on a
logarithmic scale, for the topic 'public transport'. Left: messages that
are related to public transport, right: messages that are not related to
public transport, top: results of manual classification, bottom: results of
machine classification.



In order to estimate how strongly mobile microblogging contents are
influenced by nearby points of interest (POI), we use methods of natural
language processing. As a first example, we have developed and trained a
Maximum Entropy classifier that distinguishes Tweets that are related to
public transport from those that are not. This enables us to analyse the
correlation between the location where mobile microblogging posts on this
topic were generated and their closest train station, which may be assumed
to be a strong factor that triggers posts about the topic 'public
transport'. Figure 4 shows a density plot of the corresponding probability
distribution for both the manually classified and the machine-classified
microblogging posts. The distributions of distances differ between Tweets
that are related to the topic 'public transport' and Tweets that are not,
and this difference is also visible for the machine classification. In
general, mobile microblogging posts that are related to the topic 'public
transport' tend to be nearer to the closest train station than posts that
are not related to this topic. However, the effect is not as strong as
might be assumed. The reason may be that Twitter is, to a certain extent, a
social network whose contents are influenced by many factors, of which
geospatial context is only one.
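
To illustrate the kind of classifier involved: with two classes, a maximum
entropy model is equivalent to logistic regression. The following minimal
Python sketch trains such a model on bag-of-words features by batch
gradient ascent. It is not the classifier described above; the features,
training procedure, parameter values and function names are assumptions
made for illustration, and the example texts are invented.

```python
import math
from collections import Counter


def tokenize(text):
    """Lower-case whitespace tokenisation (a deliberately simple assumption)."""
    return text.lower().split()


def probability(doc, weights, bias):
    """P(related | doc) for a bag-of-words Counter under the current model."""
    score = bias + sum(weights.get(w, 0.0) * c for w, c in doc.items())
    return 1.0 / (1.0 + math.exp(-score))


def train_maxent(texts, labels, epochs=200, learning_rate=0.5):
    """Fit a two-class maximum entropy model by batch gradient ascent
    on the log-likelihood. Labels are 1 (related) or 0 (not related)."""
    docs = [Counter(tokenize(t)) for t in texts]
    weights, bias = {}, 0.0
    for _ in range(epochs):
        grad, grad_bias = Counter(), 0.0
        for doc, label in zip(docs, labels):
            error = label - probability(doc, weights, bias)
            for word, count in doc.items():
                grad[word] += error * count
            grad_bias += error
        for word, g in grad.items():
            weights[word] = weights.get(word, 0.0) + learning_rate * g / len(docs)
        bias += learning_rate * grad_bias / len(docs)
    return weights, bias


def classify(text, weights, bias):
    """Probability that a text is related to the trained topic."""
    return probability(Counter(tokenize(text)), weights, bias)


# Invented example posts for illustration only (not data from the study).
texts = [
    "waiting for the delayed train at the platform",
    "the tram is late again",
    "missed my train to berlin",
    "great movie tonight at the cinema",
    "coffee with friends",
    "watching football tonight",
]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_maxent(texts, labels)
p_transport = classify("train delayed again", weights, bias)
p_other = classify("movie with friends", weights, bias)
```

In this toy setting `p_transport` lies above 0.5 and `p_other` below it. A
realistic classifier would need a much larger manually labelled training
set, regularisation and an evaluation on held-out posts.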